Amendments to the Claims

Please cancel Claims 1 and 9 without prejudice. Please amend Claims 2, 3, 7, 8, 10, 11,
15 and 16. The Claim Listing below will replace all prior versions of the claims in the
application: 

Claim Listing 

1. Cancelled.

2. (Currently Amended) The method of claim [[1]] 7 wherein the step of evaluating a match
between the two records comprises applying the matching process to determine a match score for 
two corresponding fields of the plurality of available fields, the two corresponding fields selected 
from corresponding locations in each of the two records. 

3. (Currently Amended) The method of claim [[1]] 7 wherein the step of evaluating a match
between the two records comprises selecting the matching process based on a common data type 
shared by both of two fields of the plurality of available fields accessed in the two records. 

4. (Original) The method of claim 3 wherein when a Boolean matching process is selected, 
the data type of both of the two fields specifies nominal data. 

5. (Original) The method of claim 3 wherein when an ordinal matching process is selected, 
the data type of both of the two fields specifies data capable of being ordered. 

6. (Original) The method of claim 3 wherein, when a vector-based matching process is 
selected, the data type of both of the two fields specifies text data. 
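
By way of non-limiting illustration only, the selection recited in claims 3 to 6 can be sketched as a lookup keyed on the data type shared by the two fields. The names `DataType`, `PROCESS_FOR_TYPE` and `select_matching_process` are hypothetical and do not appear in the application; the sketch also assumes that two fields with different data types are simply rejected, which the claims do not state.

```python
from enum import Enum


class DataType(Enum):
    """Hypothetical field data types, following claims 4 to 6."""
    NOMINAL = "nominal"   # unordered categories
    ORDINAL = "ordinal"   # data capable of being ordered
    TEXT = "text"         # unstructured, free-text data


# Assumed mapping from a common data type to a matching process.
PROCESS_FOR_TYPE = {
    DataType.NOMINAL: "boolean",
    DataType.ORDINAL: "ordinal",
    DataType.TEXT: "vector",
}


def select_matching_process(type_a: DataType, type_b: DataType) -> str:
    """Return the matching process for two fields that share a data type."""
    if type_a is not type_b:
        # Assumption: fields without a common data type are not compared.
        raise ValueError("fields must share a common data type")
    return PROCESS_FOR_TYPE[type_a]
```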

7. (Currently Amended) A method for determining whether records are similar in a database 
containing both structured and unstructured, free-text data, the method comprising the steps of: 
accessing two of the records from the database for evaluation; 
evaluating a match between the two records as a weighted match between each of a
plurality of available fields, such that a matching process is selected as appropriate from among a 
group of matching processes including strict Boolean, ordinal, and vector-based matching 
processes, wherein: 

when a strict Boolean matching process is selected, applying a match function as
an exact match test,

when an ordinal matching process is selected, applying a match function that
makes use of information concerning the size and ordering of the data domain, and

when a vector-based matching process is selected applying a match function that
uses a vector space frequency test; and

[[The method of claim 1 wherein the step of evaluating the match between the two
records comprises]] calculating a similarity score between the two records, as follows:

sim(record_i, record_j) = w_1*match(a_{1i}, a_{1j}) + w_2*match(a_{2i}, a_{2j}) + ...
+ w_n*match(a_{ni}, a_{nj})

wherein sim is a similarity function that determines the similarity score for
the two records[[;]],

record_i is a first record of the two records and is identified in the database
by an iterator i[[;]],

record_j is a second record of the two records and is identified in the
database by an iterator j[[;]],

iterator n identifies a field position for a given field a_{ni} in the record_i and a
corresponding field position for a given field a_{nj} in the record_j[[;]],

match indicates the match function[[;]], and

a symbol w_n indicates a predefined weight for each result of each match
function.
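
By way of non-limiting illustration only, the weighted similarity score of claim 7 can be sketched as follows. The claim does not give formulas for the ordinal or vector-based match functions, so the sketch assumes a rank distance normalized by the size of the data domain for the ordinal case and a cosine similarity over term-frequency vectors for the vector space frequency test. All function names, the example records and the weights are hypothetical.

```python
import math
from collections import Counter


def boolean_match(a, b):
    """Strict Boolean process: exact match test."""
    return 1.0 if a == b else 0.0


def ordinal_match(a, b, domain):
    """Ordinal process (assumed form): uses the size and ordering of the domain.

    `domain` is the ordered list of all permitted values.
    """
    if len(domain) < 2:
        return 1.0 if a == b else 0.0
    distance = abs(domain.index(a) - domain.index(b))
    return 1.0 - distance / (len(domain) - 1)


def vector_match(a, b):
    """Vector-based process (assumed form): cosine of term-frequency vectors."""
    ta, tb = Counter(a.lower().split()), Counter(b.lower().split())
    dot = sum(ta[t] * tb[t] for t in ta)
    norm = math.sqrt(sum(v * v for v in ta.values())) * math.sqrt(
        sum(v * v for v in tb.values())
    )
    return dot / norm if norm else 0.0


def sim(record_i, record_j, weights, match_fns):
    """sim = w_1*match(a_1i, a_1j) + ... + w_n*match(a_ni, a_nj)."""
    return sum(
        w * match(a_i, a_j)
        for w, match, a_i, a_j in zip(weights, match_fns, record_i, record_j)
    )


# Hypothetical records: (nominal field, ordinal field, free-text field).
severity = ["low", "medium", "high"]
record_1 = ("pump", "low", "seal leak detected")
record_2 = ("pump", "high", "seal leak detected")
score = sim(
    record_1,
    record_2,
    weights=[0.5, 0.25, 0.25],
    match_fns=[boolean_match, lambda a, b: ordinal_match(a, b, severity), vector_match],
)
```

In this example the nominal and free-text fields agree while the ordinal field lies at opposite ends of its domain, so only the first and third weights contribute to the score.
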

8. (Currently Amended) The method of claim [[1]] 7 wherein the database is a relational
database, the records are tuples, and the fields are attributes. 



9. Cancelled. 

10. (Currently Amended) The data processing system of claim [[9]] 15 wherein the data
evaluation application is configured to apply the matching process to determine a match score for 
two corresponding fields of the plurality of available fields, the two corresponding fields selected 
from corresponding locations in each of the two records. 

11. (Currently Amended) The data processing system of claim [[9]] 15 wherein the data
evaluation application is further configured to select the matching process based on a common 
data type shared by both of two fields of the plurality of available fields accessed in the two 
records. 

12. (Original) The data processing system of claim 11 wherein when the data evaluation
application selects a Boolean matching process, the data type of both of the two fields specifies 
nominal data. 

13. (Original) The data processing system of claim 11 wherein when the data evaluation
application selects an ordinal matching process, the data type of both of the two fields specifies 
data capable of being ordered. 

14. (Original) The data processing system of claim 11 wherein, when the data evaluation
application selects a vector-based matching process, the data type of both of the two fields 
specifies text data. 

15. (Currently Amended) A data processing system for determining whether records are
similar in a database containing both structured and unstructured, free-text data, the data 
processing system comprising: 

a communications interface for communicating with the database; and 

a processor coupled to the communications interface, the processor hosting and executing
a data evaluation application that is configured to:

(a) access two of the records from the database for evaluation;
(b) evaluate a match between the two records as a weighted match between each of a
plurality of available fields, such that a matching process is selected as appropriate from among a
group of matching processes including strict Boolean, ordinal, and vector-based matching 
processes, wherein: 

when a strict Boolean matching process is selected, apply a match function as an
exact match test,

when an ordinal matching process is selected, apply a match function that makes
use of information concerning the size and ordering of the data domain, and

when a vector-based matching process is selected, apply a match function that
uses a vector space frequency test; and

[[The data processing system of claim 9 wherein the data evaluation application is
configured to]] (c) calculate a similarity score between the two records, as follows:

sim(record_i, record_j) = w_1*match(a_{1i}, a_{1j}) + w_2*match(a_{2i}, a_{2j}) + ...
+ w_n*match(a_{ni}, a_{nj})

wherein sim is a similarity function that determines the similarity score for
the two records[[;]],

record_i is a first record of the two records and is identified in the database
by an iterator i[[;]],

record_j is a second record of the two records and is identified in the
database by an iterator j[[;]],

iterator n identifies a field position for a given field a_{ni} in the record_i and a
corresponding field position for a given field a_{nj} in the record_j[[;]],

match indicates the match function[[;]], and

a symbol w_n indicates a predefined weight for each result of each match
function.

16. (Currently Amended) The data processing system of claim [[9]] 15 wherein the database is a
relational database, the records are tuples, and the fields are attributes. 



